Two-Locus Likelihoods Under Variable Population Size and Fine-Scale Recombination Rate Estimation.
Abstract
Two-locus sampling probabilities have played a central role in devising an efficient composite-likelihood method for estimating fine-scale recombination rates. Due to mathematical and computational challenges, these sampling probabilities are typically computed under the unrealistic assumption of a constant population size, and simulation studies have shown that resulting recombination rate estimates can be severely biased in certain cases of historical population size changes. To alleviate this problem, we develop here new methods to compute the sampling probability for variable population size functions that are piecewise constant. Our main theoretical result, implemented in a new software package called LDpop, is a novel formula for the sampling probability that can be evaluated by numerically exponentiating a large but sparse matrix. This formula can handle moderate sample sizes ([Formula: see text]) and demographic size histories with a large number of epochs ([Formula: see text]). In addition, LDpop implements an approximate formula for the sampling probability that is reasonably accurate and scales to hundreds in sample size ([Formula: see text]). Finally, LDpop includes an importance sampler for the posterior distribution of two-locus genealogies, based on a new result for the optimal proposal distribution in the variable-size setting. Using our methods, we study how a sharp population bottleneck followed by rapid growth affects the correlation between partially linked sites. Then, through an extensive simulation study, we show that accounting for population size changes under such a demographic model leads to substantial improvements in fine-scale recombination rate estimation.
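The abstract's central computational idea, evaluating a sampling probability by exponentiating a large sparse matrix under a piecewise-constant size history, can be illustrated with a much smaller toy problem. The sketch below is not LDpop's formula or API. It assumes a single-locus coalescent in which the state is just the number of ancestral lineages, and it assumes that an epoch with relative population size `eta` scales every coalescence rate by `1/eta`. The names `lineage_generator` and `propagate` are hypothetical. It uses SciPy's `expm_multiply`, which applies a matrix exponential to a vector without forming the dense exponential.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import expm_multiply


def lineage_generator(n, eta):
    """Sparse rate matrix for the number of ancestral lineages (n down to 1).

    Toy single-locus model (an assumption for illustration): k lineages
    coalesce to k - 1 at rate k * (k - 1) / 2 / eta, where eta is the
    relative population size in the epoch.
    """
    rows, cols, vals = [], [], []
    for k in range(2, n + 1):
        rate = k * (k - 1) / 2.0 / eta
        i = n - k  # state index 0 holds n lineages, index n - 1 holds 1 lineage
        rows += [i, i]
        cols += [i, i + 1]
        vals += [-rate, rate]
    return csr_matrix((vals, (rows, cols)), shape=(n, n))


def propagate(n, epochs):
    """Distribution of the lineage count after passing through all epochs.

    epochs: list of (duration, eta) pairs, ordered from present to past.
    """
    p = np.zeros(n)
    p[0] = 1.0  # start with all n lineages present
    for duration, eta in epochs:
        Q = lineage_generator(n, eta)
        # Row-vector update p^T exp(Q * dt), written as exp(Q^T * dt) p.
        p = expm_multiply(Q.T.tocsr() * duration, p)
    return p


if __name__ == "__main__":
    # Hypothetical history: constant size, a sharp bottleneck, then a larger size.
    history = [(0.1, 1.0), (0.2, 0.05), (1.0, 2.0)]
    print(propagate(10, history))
```

Because the rate matrix is constant within each epoch, the history enters only as a product of per-epoch exponentials, so the cost grows linearly with the number of epochs. The two-locus state space treated in the paper is far larger than this lineage-count chain, which is why sparsity matters there.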
Related articles
Bayesian inference of fine-scale recombination rates using population genomic data.
Recently, several statistical methods for estimating fine-scale recombination rates using population samples have been developed. However, currently available methods that can be applied to large-scale data are limited to approximated likelihoods. Here, we developed a full-likelihood Markov chain Monte Carlo method for estimating recombination rate under a Bayesian framework. Genealogies underl...
On the Recombination Rate Estimation in the Presence of Population Substructure
As recombination events are not uniformly distributed along the human genome, the estimation of fine-scale recombination maps, e.g. HapMap Project, has been one of the major research endeavors over the last couple of years. For simulation studies, these estimates provide realistic reference scenarios to design future studies and to develop novel methodology. To achieve a feasible framework for th...
The two-locus ancestral graph in a subdivided population: convergence as the number of demes grows in the island model.
We study the ancestral recombination graph for a pair of sites in a geographically structured population. In particular, we consider the limiting behavior of the graph, under Wright's island model, as the number of subpopulations, or demes, goes to infinity. After an instantaneous sample-size adjustment, the graph becomes identical to the two-locus graph in an unstructured population, but with ...
Disequilibrium likelihoods for fine-scale mapping of a rare allele.
Genetic linkage studies based on pedigree data have limited resolution, because of the relatively small number of segregations. Disequilibrium mapping, which uses population associations to infer the location of a disease mutation, provides one possible strategy for narrowing the candidate region. The coalescent process provides a model for the ancestry of a sample of disease alleles, and recom...
Journal title:
- Genetics
Volume 203, Issue 3
Publication date: 2016